arc: emit clobber of CC for -mcpu=em x >> 31 #186

exur00 · 2025-10-28T13:53:05Z

This is a fixed version of the solution that was merged in #181 and later reverted because it caused issues. It has been built and tested on Jenkins without issues.

MichielDerhaeg

Implementation looks fine. Let's improve the commit message a bit. Help the reviewer understand what you did.

If the target has no barrel shifter, clobber is now emitted in a parallel RTL expression. si3_loop matches the parallel expression and emits the necessary expressions that may or may not clobber the CC. This happens in split1, before registers are assigned and dead stores are removed. If the clobber was unnecessary, its consequences will be removed by those steps.

A bit too verbose IMO. Just describe the problem we're solving:

Devices without a barrel shifter end up using a sequence of
instructions. These can use the condition codes and/or loop count
register, so those need to be marked as 'clobbered'. These clobbers were previously added only after split1, which is too late. This patch adds these clobbers from the beginning, in the define_expand.

Explain succinctly the changes you made. Why did you merge _nobs and _loop?

si3_cnt1_clobber is placed below all the specific si3_cnt1 patterns.
It matches the si3_loop pattern in the case where operand2 has become
const_int 1, and removes the clobber so the specific si3_cnt1 patterns
can be matched.
If si3_cnt1_clobber were placed above these patterns, some si3_cnt1
patterns would try to emit extra clobbers to match it instead of the
more specific pattern that matches exactly. This would cause clobbers
of hard registers, causing failure.

This is a bit hard to parse. But I don't think you need to mention this. I think the comment you wrote in the code is sufficient.
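
To picture the "sequence of instructions" from the suggested wording: without a barrel shifter, a shift by an arbitrary count is typically emulated by repeating a single-bit shift. The C sketch below only models that idea and is not the code GCC emits. The name `lshr_by_loop` is hypothetical, and the masking of the count to 5 bits is an assumption. The loop counter stands in for the loop count register, and the loop test stands in for a possible condition-code update.

```c
#include <stdint.h>

/* Hypothetical model: logical right shift built from single-bit shifts,
   as a core without a barrel shifter would have to do it. */
static uint32_t lshr_by_loop(uint32_t value, unsigned count)
{
    count &= 31;            /* assumption: shift counts are taken modulo 32 */
    while (count-- > 0)     /* the counter plays the role of lp_count */
        value >>= 1;        /* one single-bit shift per iteration */
    return value;
}
```

Because the emulation needs a counter and possibly a comparison, the registers holding them cannot be assumed to survive the shift, which is why they have to be marked as clobbered.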

luismgsilva · 2025-10-31T11:53:19Z

This, and perhaps a before-and-after assembly example, might help.

Address the issue explained here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120375

Devices without a barrel shifter end up using a sequence of instructions. These can use the condition codes and/or loop count register, so those need to be marked as 'clobbered'. These clobbers were previously added only after split1, which is too late. This patch adds these clobbers from the beginning, in the define_expand.

Previously, define_insn_and_split *<insn>si3_nobs would match any shift or rotate instruction and would generate the necessary patterns to emulate a barrel shifter, but it did not have any output assembly for itself. In many cases this would create a loop with parallel clobbers. This pattern is then matched by the <insn>si3_loop pattern.

In the no-barrel-shifter.c test, the tree code:

    ;; no-barrel-shifter.c:9: int sign = (x >> 31) & 1;
    _2 = x.0_1 >> 31;

becomes, in the expand pass, the following pattern, which matches *lshrsi3_nobs:

    (insn 18 17 19 4 (set (reg:SI 153 [ _2 ])
            (lshiftrt:SI (reg/v:SI 156 [ x ])
                (const_int 31 [0x1f]))) "test2.c":9:24 -1
         (nil))

This pattern lacks the necessary clobbers and remains untouched until the split1 pass. Together with the later branch it becomes:

    ;; no-barrel-shifter.c:9: int sign = (x >> 31) & 1;
    add.f 0,r0,r0
    ;; no-barrel-shifter.c:14: if (mag == 0x7f800000)
    beq.d .L8
    ;; no-barrel-shifter.c:9: int sign = (x >> 31) & 1;
    rlc r0,0

This leads to an issue: the add.f instruction overwrites CC, but beq expects CC to contain an earlier value indicating mag == 0x7f800000.

Now, these are combined in define_insn_and_split <insn>si3_loop, which is explicitly emitted in the define_expand and already contains the clobbers. This can then be split into another pattern or remain the loop pattern.

In the expand pass, the same example now becomes:

    (insn 18 17 19 4 (parallel [
                (set (reg:SI 153 [ _2 ])
                    (lshiftrt:SI (reg/v:SI 156 [ x ])
                        (const_int 31 [0x1f])))
                (clobber (reg:SI 60 lp_count))
                (clobber (reg:CC 61 cc))
            ]) "test2.c":9:24 -1
         (nil))

Because the correct clobbers are now taken into account, the branch condition is reevaluated by using breq instead of br.

    ;; no-barrel-shifter.c:9: int sign = (x >> 31) & 1;
    add.f 0,r0,r0
    rlc r0,0
    ;; no-barrel-shifter.c:14: if (mag == 0x7f800000)
    breq r2,2139095040,.L8

Regtested for arc.

Co-authored-by: Keith Packard <[email protected]>
Signed-off-by: Loeka Rogge <[email protected]>
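
The hazard described in the commit message can be modelled in a few lines of C. This is a sketch under assumptions, not ARC reference code. `struct flags`, `add_f`, `rlc_zero` and `lshr31` are hypothetical names standing in for the flag register and for the `add.f 0,r0,r0` / `rlc r0,0` pair, and their behaviour follows the usual meaning of add-with-flags and rotate-left-through-carry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two condition-code bits that matter here. */
struct flags { bool zero; bool carry; };

/* Models "add.f 0,r0,r0": the sum is discarded, only the flags change. */
static void add_f(struct flags *cc, uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;
    cc->zero = (sum == 0);
    cc->carry = (sum < a);      /* unsigned wrap-around means carry out */
}

/* Models "rlc r0,0": rotating 0 left through carry yields the carry bit. */
static uint32_t rlc_zero(const struct flags *cc)
{
    return cc->carry ? 1u : 0u;
}

/* x >> 31 via the two-instruction sequence; note that it rewrites *cc. */
static uint32_t lshr31(struct flags *cc, uint32_t x)
{
    add_f(cc, x, x);            /* carry out of x + x is bit 31 of x */
    return rlc_zero(cc);
}
```

If `cc.zero` was set by an earlier comparison and `lshr31` runs before the branch reads it, the branch sees the flags of the shift instead. Declaring the clobber makes that dependency visible to the compiler.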

exur00 · 2025-11-03T11:57:43Z

I rewrote the commit with a better explanation and some RTL and assembly from the test. I also added keithp (who reported the bug) as co-author.

luismgsilva

LGTM. Great commit message.

exur00 requested a review from MichielDerhaeg October 30, 2025 10:53

exur00 self-assigned this Oct 30, 2025

MichielDerhaeg requested review from Alexehv77 and luismgsilva October 31, 2025 11:05

MichielDerhaeg requested changes Oct 31, 2025

exur00 force-pushed the loeka/no-barrel-shifter branch 3 times, most recently from a035cf1 to b25c113 Compare November 3, 2025 11:52

exur00 force-pushed the loeka/no-barrel-shifter branch from b25c113 to 3f600ca Compare November 3, 2025 11:55

MichielDerhaeg approved these changes Nov 4, 2025

luismgsilva approved these changes Nov 4, 2025

exur00 merged commit bcbd1fa into arc-2026.03 Nov 4, 2025

4 participants
